Chemistry LibreTexts

1.0: Variational Theory and the Variational Principle


    A very useful approximation method is known as the variational method, which forms the basis of much of quantum chemistry, including Hartree-Fock theory, density functional theory, and variational quantum Monte Carlo. The underlying theorem of the method is the Ritz theorem, which states that, given a time-independent Hamiltonian \(H\) with a set of eigenvalues \(E_n\) and eigenvectors \(\vert\psi_n\rangle\) satisfying

    \[ H\vert\psi_n\rangle = E_n\vert\psi_n\rangle \]

    then for any arbitrary ket vector \(\vert\psi\rangle\) in the Hilbert space, the expectation value of \(H\) in this ket must satisfy

    \[ \langle H\rangle \equiv {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle}\geq E_0 \]

    where \(E_0\) is the exact ground state energy. Equality only holds if

    \[ \vert\psi\rangle = \vert\psi_0\rangle \]
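    As a quick illustration (not part of the original text), the theorem can be checked numerically in a finite-dimensional Hilbert space, where \(H\) is simply a Hermitian matrix and \(\langle H\rangle\) is the Rayleigh quotient; the random matrix and trial kets below are arbitrary stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random real-symmetric (hence Hermitian) matrix standing in for H
# on a 6-dimensional Hilbert space.
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2

E0 = np.linalg.eigvalsh(H)[0]          # exact ground-state energy (lowest eigenvalue)

# Rayleigh quotient <psi|H|psi> / <psi|psi> for many random trial kets:
# the variational bound <H> >= E0 holds for every one of them.
for _ in range(1000):
    psi = rng.standard_normal(6)
    expH = psi @ H @ psi / (psi @ psi)
    assert expH >= E0 - 1e-12

print(f"E0 = {E0:.6f}; every random trial energy lies above it")
```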

    The proof of the theorem is relatively simple. We expand \(\vert\psi\rangle\) in the eigenstates of \(H\):

    \[ \vert\psi\rangle = \sum_n C_n\vert\psi_n\rangle \]

    Then

    \[ \langle \psi\vert\psi\rangle = \sum_{m,n}C_n^*C_m \langle \psi_n\vert\psi_m\rangle = \sum_{n}\vert C_n\vert^2 \]

    and

    \[ \langle \psi\vert H\vert\psi\rangle = \sum_{m,n}C_n^*C_m\langle \psi_n\vert H\vert\psi_m\rangle = \sum_{m,n}C_n^*C_mE_m\delta_{mn} = \sum_nE_n\vert C_n\vert^2 \]

    Therefore, the expectation value of \(H\) in the arbitrary ket vector is

    \[ \langle H\rangle = {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} = {\sum_nE_n\vert C_n\vert^2 \over \sum_n\vert C_n\vert^2} \]

    Since \(\vert C_n\vert^2\geq 0\) and \(E_n\geq E_0\), it follows that

    \[ {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} = {\sum_nE_n\vert C_n\vert^2 \over \sum_n\vert C_n\vert^2} \geq {E_0\sum_n\vert C_n\vert^2 \over \sum_n\vert C_n\vert^2} = E_0 \]

    Therefore, we have

    \[ {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle}\geq E_0 \]

    It is also clear that equality can only hold if \(C_n=0\) for all \(n>0\), in which case,

    \[ \vert\psi\rangle = \vert\psi_0\rangle \]

    The conclusion is that \(E_0\) is, therefore, a lower bound on \(\langle H\rangle\), which means that we can approximate \(E_0\) by minimizing \(\langle H\rangle\) with respect to any parameters on which \(\vert\psi\rangle\) depends.
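    As a concrete sketch of such a minimization (an illustration, not in the original text), consider the one-dimensional harmonic oscillator \(H = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2\) (with \(\hbar = m = \omega = 1\), so the exact ground-state energy is \(1/2\)) and the one-parameter Gaussian trial family \(\psi_a(x) = e^{-ax^2}\); the grid, finite-difference Laplacian, and parameter scan below are all choices made for this example:

```python
import numpy as np

# Real-space grid for evaluating the expectation-value integrals.
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

def energy(a):
    """<H> for the trial function psi_a(x) = exp(-a x^2)."""
    psi = np.exp(-a * x**2)
    # Second derivative by central finite differences (interior points only).
    d2psi = (psi[2:] - 2 * psi[1:-1] + psi[:-2]) / dx**2
    Hpsi = -0.5 * d2psi + 0.5 * x[1:-1]**2 * psi[1:-1]
    return np.sum(psi[1:-1] * Hpsi) / np.sum(psi[1:-1]**2)

# Scan the variational parameter and take the minimum.
a_grid = np.linspace(0.1, 1.5, 141)
energies = [energy(a) for a in a_grid]
a_best = a_grid[np.argmin(energies)]
print(f"best a = {a_best:.3f}, E = {min(energies):.6f} (exact E0 = 0.5)")
```

The minimum lands at \(a = 1/2\), where the Gaussian is the exact ground state, and the variational energy never drops below \(E_0 = 1/2\) (up to discretization error).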

    Note that

    \[ \langle H \rangle ={\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} \]

    depends on all components of \(\vert\psi\rangle\). If we write the expectation values as integrals (in one dimension, for example), then we see that

    \[ \langle H \rangle = {\int\;dx\;\psi^*(x)H\psi(x) \over \int\;dx\;\psi^*(x)\psi(x)} \]

    which shows that \(\langle H\rangle\) depends on all values of the function \(\psi(x)\), which is known as a trial wave function. We, therefore, call \(\langle H\rangle\) a functional of \(\psi(x)\). Loosely speaking, a functional is a function of a function. We, therefore, denote the variational functional as

    \[ E[\psi] ={\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} \]

    from which it follows that

    \[\begin{align*} E[\psi] &\geq E_0 \\ E[\psi_0] &= E_0 \end{align*}\]

    The functional character of \(E[\psi]\) can be used to derive another important property of the functional, which is the stationarity property around any eigenstate of \(H\). In order to derive the stationarity condition, we consider making a small variation of the trial ket according to

    \[\begin{align*} \vert\psi\rangle &\longrightarrow \vert\psi\rangle + \vert\delta \psi\rangle \\ \langle \psi\vert &\longrightarrow \langle \psi\vert + \langle \delta \psi\vert \end{align*}\]

    and we evaluate the functional \(E[\psi+\delta\psi]\):

    \[ E[\psi+\delta\psi] = {\left[\langle \psi\vert + \langle \delta \psi\vert\right]H\left[\vert\psi\rangle + \vert\delta \psi\rangle \right] \over \left[\langle \psi\vert + \langle \delta \psi\vert\right]\left[\vert\psi\rangle + \vert\delta \psi\rangle \right]} \]

    Now, we work to first order in \(\vert\delta \psi\rangle \) or \(\langle \delta \psi\vert\). Thus, we expand the functional:

    \[ E[\psi+\delta\psi] = E[\psi] + \langle \delta \psi\vert{\partial E \over \partial \langle \psi\vert} + {\partial E \over \partial \vert\psi\rangle }\vert\delta \psi\rangle + \cdots \]

    and the right side becomes

    \[\begin{align*}
    {\langle \psi\vert H\vert\psi\rangle + \langle \delta \psi\vert H\vert\psi\rangle + \langle \psi\vert H\vert\delta \psi\rangle \over \langle \psi\vert\psi\rangle + \langle \psi\vert\delta \psi\rangle + \langle \delta \psi\vert\psi\rangle}
    &= {\langle \psi\vert H\vert\psi\rangle + \langle \delta \psi\vert H\vert\psi\rangle + \langle \psi\vert H\vert\delta \psi\rangle \over \langle \psi\vert\psi\rangle}\left[1 - {\langle \psi\vert\delta\psi\rangle \over \langle \psi\vert\psi\rangle} - {\langle \delta\psi\vert\psi\rangle \over \langle \psi\vert\psi\rangle}\right] + \cdots \\
    &= {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} + {\langle \delta\psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} + {\langle \psi\vert H\vert\delta\psi\rangle \over \langle \psi\vert\psi\rangle} - {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle}{\langle \psi\vert\delta\psi\rangle \over \langle \psi\vert\psi\rangle} - {\langle \psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle}{\langle \delta\psi\vert\psi\rangle \over \langle \psi\vert\psi\rangle} + \cdots \\
    &= E[\psi] + {\langle \delta\psi\vert H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} + {\langle \psi\vert H\vert\delta\psi\rangle \over \langle \psi\vert\psi\rangle} - E[\psi]{\langle \psi\vert\delta\psi\rangle \over \langle \psi\vert\psi\rangle} - E[\psi]{\langle \delta\psi\vert\psi\rangle \over \langle \psi\vert\psi\rangle} \\
    &= E[\psi] + \langle \delta \psi\vert \left[{H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} - {E[\psi]\vert\psi\rangle \over \langle \psi\vert\psi\rangle}\right] + \left[{\langle \psi\vert H \over \langle \psi\vert\psi\rangle} - {\langle \psi\vert E[\psi] \over \langle \psi\vert\psi\rangle}\right]\vert\delta\psi\rangle
    \end{align*}\]

    Now, comparing the left and right sides, we have

    \[\begin{align*}
    {\partial E \over \partial \langle \psi\vert} &= {H\vert\psi\rangle \over \langle \psi\vert\psi\rangle} - {E[\psi]\vert\psi\rangle \over \langle \psi\vert\psi\rangle} \\
    {\partial E \over \partial \vert\psi\rangle} &= {\langle \psi\vert H \over \langle \psi\vert\psi\rangle} - {\langle \psi\vert E[\psi] \over \langle \psi\vert\psi\rangle}
    \end{align*}\]

    The stationarity condition is now obtained by setting the two first derivatives of \(E[\psi]\) to zero, which yields two conditions:

    \[\begin{align*}
    H\vert\psi\rangle &= E[\psi]\vert\psi\rangle \\
    \langle \psi\vert H &= \langle \psi\vert E[\psi]
    \end{align*}\]

    which are equivalent, being simply adjoints of each other. Thus, the stationary condition is

    \[ H\vert\psi\rangle = E[\psi]\vert\psi\rangle \]

    which can be satisfied only if \(\vert\psi\rangle\) is an eigenvector of \(H\) with eigenvalue \(E[\psi]\). This suggests that any eigenvector of \(H\) can be found by searching the functional \(E[\psi]\) for stationary points. Although possible in principle, this is very difficult to implement in practice unless the dimensionality of the system is very low. However, if anyone were able to come up with an efficient algorithm for doing so, the variational theory guarantees that the process will yield the eigenvectors of \(H\).
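    The stationarity property is easy to verify numerically in a finite-dimensional space, where it reduces to the familiar fact that the Rayleigh quotient is stationary exactly at the eigenvectors of the matrix \(H\); the small random matrix below is just an illustrative stand-in (not from the original text):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
H = (A + A.T) / 2                      # a model Hermitian Hamiltonian

evals, evecs = np.linalg.eigh(H)

# Every eigenvector -- ground state or excited -- satisfies the
# stationarity condition H|psi> = E[psi]|psi>, with E[psi] the
# Rayleigh quotient evaluated at that eigenvector.
for n in range(5):
    v = evecs[:, n]
    Ev = v @ H @ v / (v @ v)
    assert np.allclose(H @ v, Ev * v)
    assert abs(Ev - evals[n]) < 1e-10

# A generic non-eigenvector is NOT stationary: H|psi> - E[psi]|psi> != 0.
w = rng.standard_normal(5)
Ew = w @ H @ w / (w @ w)
residual = np.linalg.norm(H @ w - Ew * w)
print(f"eigenvector residuals ~ 0; random-vector residual = {residual:.3f}")
```

This is the idea behind the Rayleigh-Ritz procedure: restricting \(\vert\psi\rangle\) to a finite basis turns the search for stationary points of \(E[\psi]\) into a matrix eigenvalue problem.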


    This page titled 1.0: Variational Theory and the Variational Principle is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Mark E. Tuckerman.